Group LMMSE demodulation using noise and interference covariance matrix for reception on a cellular downlink

ABSTRACT

A method for filtering in a wireless downlink channel, where all dominant transmitting sources use inner codes from a particular set, includes the steps of estimating a channel matrix seen from a desired transmitter source in response to a pilot or preamble signal; converting the estimated channel matrix into an effective channel matrix responsive to the inner code of the desired transmitting source; estimating a covariance matrix of noise plus interference in a linear model whose output is an equivalent of the received observations and in which the effective channel matrix corresponding to each dominant transmitting source inherits the structure of its inner code; computing a signal-to-noise-interference-ratio SINR responsive to the covariance matrix and the effective channel matrix corresponding to the desired source; and feeding back the computed SINR to the transmitter source.

This application claims the benefit of U.S. Provisional Application No. 60/894,555, entitled “Analysis of Multiuser Stacked Space-Time Orthogonal and Quasi-Orthogonal Designs”, filed on Mar. 13, 2007; is related to U.S. patent application Ser. No. 12/047,527, entitled “GROUP MMSE-DFD WITH ORDER AND FILTER COMPUTATION FOR RECEPTION OF A CELLULAR DOWNLINK”, filed on Mar. 13, 2008; is related to U.S. patent application Ser. No. 12/047,544, entitled “GROUP MMSE-DFD WITH RATE (SINR) FEEDBACK AND PRE-DETERMINED DECODING ORDER FOR RECEPTION OF A CELLULAR DOWNLINK”, filed on Mar. 13, 2008; and is related to U.S. patent application Ser. No. 12/047,555, entitled “GROUP MMSE-DFD WITH RATE (SINR) FEEDBACK AND WITHOUT PRE-DETERMINED DECODING ORDER FOR RECEPTION OF A CELLULAR DOWNLINK”, filed on Mar. 13, 2008; the contents of all of which are incorporated by reference herein.

BACKGROUND OF THE INVENTION

The present invention relates generally to wireless communications and, more particularly, to group linear minimum mean-squared error (LMMSE) demodulation using a noise and interference covariance matrix for reception on a wireless downlink.

A wireless cellular system consists of several base-stations or access points, each providing signal coverage to a small area known as a cell. Each base-station controls multiple users and allocates resources using multiple access methods such as OFDMA, TDMA, CDMA, etc., which ensure that the mutual interference between users within a cell (a.k.a. intra-cell users) is avoided. On the other hand, co-channel interference caused by out-of-cell transmissions remains a major impairment. Traditionally, cellular wireless networks have dealt with inter-cell interference by locating co-channel base-stations as far apart as possible via static frequency reuse planning, at the price of lowering spectral efficiency. More sophisticated frequency planning techniques include the fractional frequency reuse scheme, where for the cell interior a universal reuse is employed, but for the cell edge the reuse factor is greater than one. Future network evolutions are envisioned to have smaller cells and employ a universal (or an aggressive) frequency reuse. Therefore, some sort of proactive inter-cell interference mitigation is required, especially for edge users. Recently, it has been shown that system performance can be improved by employing advanced multi-user detection (MUD) for interference cancellation or suppression. However, in the downlink channel, which is expected to be the bottleneck in future cellular systems, only limited signal processing capabilities are present at the mobiles, which puts a hard constraint on the permissible complexity of such MUD techniques.

In the downlink, transmit diversity techniques are employed to protect the transmitted information against fades in the propagation environment. Future cellular systems such as the 3GPP LTE system are poised to deploy base-stations with two or four transmit antennas in addition to legacy single transmit antenna base-stations and cater to mobiles with up to four receive antennas. Consequently, these systems will have multi-antenna base-stations that employ space-only inner codes (such as long-term beam forming) and space-time (or space-frequency) inner codes based on the 2×2 orthogonal design (a.k.a. Alamouti design) and the 4×4 quasi-orthogonal design, respectively. The aforementioned inner codes are leading candidates for downlink transmit diversity in the 3GPP LTE system for data as well as control channels. The system designer must ensure that each user receives the signals transmitted on the control channel with a large enough SINR, in order to guarantee coverage and a uniform user experience irrespective of its position in the cell. Inter-cell interference coupled with stringent complexity limits at the mobile makes these goals significantly harder to achieve, particularly at the cell edge. The idea of using the structure of the co-channel interference to design filters has been proposed, where a group decorrelator was designed for an uplink channel with two users, each employing the Alamouti design as an inner code. An improved group decorrelator has also been derived for a multi-user uplink where each user employs the 4×4 quasi-orthogonal design of rate 1 symbol per channel use. Improved group decorrelators have resulted in higher diversity orders and have also preserved the (quasi-) decoupling property of the constituent (quasi-) orthogonal inner codes. Group LMMSE demodulation is known in the prior art. However, in the conventional Group LMMSE demodulator the structure of the noise plus interference covariance matrix is not exploited to design filters, i.e., the improved filters are not used. This results in performance degradation.

Accordingly, there is a need for a method of reception on a downlink channel with improved interference suppression which exploits the structure or the spatio-temporal correlation present in the co-channel interference.

SUMMARY OF THE INVENTION

In accordance with the invention, a method for filtering in a wireless downlink channel, where all dominant transmitting sources use inner codes from a particular set, includes the steps of estimating a channel matrix seen from a desired transmitter source in response to a pilot or preamble signal; converting the estimated channel matrix into an effective channel matrix responsive to the inner code of the desired transmitting source; estimating a covariance matrix of noise plus interference in a linear model whose output is an equivalent of the received observations and in which the effective channel matrix corresponding to each dominant transmitting source inherits the structure of its inner code; computing a signal-to-noise-interference-ratio SINR responsive to the covariance matrix and the effective channel matrix corresponding to the desired source; and feeding back the computed SINR to the transmitter source.

In another aspect of the invention, a method for filtering in a wireless downlink channel, where all dominant sources use inner codes from a particular set, includes the steps of estimating a channel matrix seen from a desired transmitter source in response to a pilot or preamble signal; converting the estimated channel matrix into an effective channel matrix responsive to the inner code of the desired transmitting source; estimating a covariance matrix of noise plus interference in a linear model whose output is an equivalent of the received observations and in which the effective channel matrix corresponding to each dominant transmitter source inherits the structure of its inner code; computing a linear minimum mean-squared error LMMSE filter responsive to the covariance matrix and the effective channel matrix corresponding to the desired source; and demodulating a signal from the desired transmitter source using the LMMSE filter.

BRIEF DESCRIPTION OF DRAWINGS

These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference to the following detailed description and the accompanying drawings.

FIG. 1 is a schematic of adjacent cell interference in a cellular network demonstrating a problem that the invention solves;

FIG. 2 is a receiver flow diagram for the case when the desired source transmits at a fixed pre-determined rate and the inner code and modulation and coding scheme (MCS) employed by it are known to the destination, in accordance with the invention; and

FIG. 3 is a receiver flow diagram for the case when there is a feedback link (channel) between the destination and the desired source and the destination determines a rate (or equivalently an SINR) and feeds that back to the desired source, which will then transmit data at that rate, in accordance with the invention.

DETAILED DESCRIPTION

1. Introduction

The invention is directed to a cellular downlink where the user receives data from the serving base-station and is interfered by adjacent base-stations, as shown by the diagram of FIG. 1. In general, the invention is applicable in a scenario where the user (destination) receives signals simultaneously from multiple sources and is interested in the signal transmitted by one (desired) source. The signals transmitted by all base-stations have structure. In particular, the inner codes used by all transmitters are from the set of inner codes given in (2) to (5). Only the inner code of the desired source is known to the destination, whereas those of the others may not be known.

The inventive method resides in the user (destination) receiver design, in which we exploit the structure of the transmitted signals to design filters that yield improved performance (henceforth referred to as improved filters). Moreover, the computational cost of designing these filters can be reduced (efficient filter design: see Section 4 below) and the demodulation complexity can be kept low; see Theorem 1 below for specifics.

More specifically, the inventive method provides for group linear MMSE interference suppression reception that not only suppresses the co-channel interference but also preserves the decoupling property of the (quasi-) orthogonal inner codes, which results in a much reduced demodulation complexity. Moreover, the cost of computing the linear MMSE filter is kept low, and the user does not decode the interfering signals, nor is it assumed to know either the number of interfering base-stations or the particular inner codes employed by them. All these factors make this receiver highly suitable for practical use.

The process steps in accordance with the invention are shown in FIG. 2 and FIG. 3. FIG. 2 is a receiver flow diagram for the case when the desired source transmits at a fixed pre-determined rate and the modulation and coding scheme (MCS) employed is known to the destination. FIG. 3 is a receiver flow diagram for the case when there is a feedback link (channel) between the destination and the desired source. The destination determines a rate (or equivalently an SINR) and feeds that back to the desired source, which will then transmit data at that rate.

Referring now to FIG. 2, the receiver is initialized 20 with an inner code and modulation and coding scheme (MCS) of the desired source, and the received signal observations determined in accordance with the matrix relationship (6) (see the formula derivation and details in Section 2.1, System Model). In response to a pilot or preamble signal, an estimation of the channel matrix of the desired source is performed 21, followed by an estimation of a covariance matrix of the noise plus interference in an equivalent linear model (see formulas (38) and (39) below) 22. Using the structure of the covariance matrix, the receiver efficiently computes an LMMSE filter (in accordance with Section 4, Efficient Inverse Computation). The LMMSE filter and a decoupling property according to Theorem 1 (explained below) are used to efficiently demodulate the desired source or signal.

Referring now to FIG. 3, the receiver is initialized 30 with an inner code of the desired source, and the received signal observations determined in accordance with the matrix relationship (6) (see the formula derivation and details in Section 2.1, System Model). In response to a pilot or preamble signal, an estimation of the channel matrix of the desired source is performed 31, followed by an estimation of a covariance matrix of the noise plus interference in an equivalent linear model (see formulas (38) and (39) below) 32. Using the structure of the covariance matrix and a decoupling property, the receiver efficiently computes the signal-to-interference-noise ratio SINR (in accordance with Section 4, Efficient Inverse Computation, and Theorem 1) 33. The computed SINR is fed back to the transmitter.

2. System Descriptions

2.1. System Model

We consider a downlink fading channel, depicted in FIG. 1, where the signals from K base-stations (BSs) are received by the user of interest. The user is equipped with N≥1 receive antennas and is served by only one BS but interfered by the remaining K−1 others. The BSs are also equipped with multiple transmit antennas and transmit using any one out of a set of three space-time inner codes. The 4×N channel output received over four consecutive symbol intervals is given by

$Y = XH + V, \qquad (1)$

where the fading channel is modeled by the matrix H. For simplicity, we assume a synchronous model. In practice this assumption is reasonable at the cell edge and for small cells. Moreover, the model in (1) is also obtained over four consecutive tones in the downlink of a broadband system employing OFDM, such as the 3GPP LTE system. We partition H as $H = [H_1^T, \ldots, H_K^T]^T$, where $H_k$ contains the rows of H corresponding to the $k$th BS. The channel is quasi-static and the matrix H stays constant for 4 symbol periods, after which it may jump to an independent value. The random matrix H is not known to the transmitters (BSs) and the additive noise matrix V has i.i.d. $\mathcal{CN}(0, 2\sigma^2)$ elements.

The transmitted matrix X can be partitioned as $X = [X_1, \ldots, X_K]$, where

$\begin{matrix}{X_{k} = \begin{bmatrix}x_{k,1} & x_{k,2} & x_{k,3} & x_{k,4} \\ -x_{k,2}^{\dagger} & x_{k,1}^{\dagger} & -x_{k,4}^{\dagger} & x_{k,3}^{\dagger} \\ x_{k,3} & x_{k,4} & x_{k,1} & x_{k,2} \\ -x_{k,4}^{\dagger} & x_{k,3}^{\dagger} & -x_{k,2}^{\dagger} & x_{k,1}^{\dagger}\end{bmatrix},} & (2)\end{matrix}$

when the $k$th BS employs the quasi-orthogonal design as its inner code,

$\begin{matrix}{X_{k} = \begin{bmatrix}x_{k,1} & x_{k,2} \\ -x_{k,2}^{\dagger} & x_{k,1}^{\dagger} \\ x_{k,3} & x_{k,4} \\ -x_{k,4}^{\dagger} & x_{k,3}^{\dagger}\end{bmatrix},} & (3)\end{matrix}$

when the $k$th BS employs the Alamouti design, and finally

$X_k = [x_{k,1}\ x_{k,2}\ x_{k,3}\ x_{k,4}]^T, \qquad (4)$

when the $k$th BS has only one transmit antenna. The power constraints are taken to be $E\{|x_{k,q}|^2\} \le 2w_k$, $1 \le k \le K$, $1 \le q \le 4$.
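The three inner codes in (2) to (4) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample symbol values are assumptions made here, and the dagger is read as complex conjugation of a scalar. The helper `gram` forms X^H X so that the (quasi-) orthogonality of the columns can be inspected.

```python
def quasi_orthogonal(x1, x2, x3, x4):
    """4x4 quasi-orthogonal inner code of (2); each row is one symbol interval."""
    c = lambda z: z.conjugate()
    return [
        [x1, x2, x3, x4],
        [-c(x2), c(x1), -c(x4), c(x3)],
        [x3, x4, x1, x2],
        [-c(x4), c(x3), -c(x2), c(x1)],
    ]

def alamouti(x1, x2, x3, x4):
    """4x2 inner code of (3): two stacked Alamouti blocks."""
    c = lambda z: z.conjugate()
    return [
        [x1, x2],
        [-c(x2), c(x1)],
        [x3, x4],
        [-c(x4), c(x3)],
    ]

def single_antenna(x1, x2, x3, x4):
    """4x1 inner code of (4) for a single transmit antenna."""
    return [[x1], [x2], [x3], [x4]]

def gram(X):
    """Gram matrix X^H X of the columns of X."""
    rows, cols = len(X), len(X[0])
    return [[sum(X[t][a].conjugate() * X[t][b] for t in range(rows))
             for b in range(cols)] for a in range(cols)]

# Hypothetical symbol values chosen only for illustration.
symbols = (1 + 2j, 3 - 1j, -2 + 1j, 1 + 1j)
X_qo = quasi_orthogonal(*symbols)
X_al = alamouti(*symbols)
X_sa = single_antenna(*symbols)

G_al = gram(X_al)  # diagonal: the Alamouti columns are orthogonal
G_qo = gram(X_qo)  # only the column pairs (1,3) and (2,4) are coupled
```

For the Alamouti code the Gram matrix is a scaled identity, while for the quasi-orthogonal code the only nonzero off-diagonal entries couple columns 1 and 3 and columns 2 and 4, which is the pairing that reappears in Theorem 1.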

We also let the model in (1) include a BS with multiple transmit antennas which employs beam forming. In this case

$X_k = [x_{k,1}\ x_{k,2}\ x_{k,3}\ x_{k,4}]^T u_k, \qquad (5)$

where $u_k$ is the beam forming vector employed by BS k. Note that $X_k$ in (5) can be seen as a space-only inner code. Also, beam forming in which the vector $u_k$ depends only on the long-term channel information is referred to as long-term beam forming. We can absorb the vector $u_k$ into the channel matrix $H_k$ and consider BS k to be a BS with a single virtual antenna transmitting (4). Notice that the inner codes in (2) to (5) all have a rate of one symbol per channel use, and we assume that the desired BS employs any one out of these inner codes. Furthermore, we can also accommodate an interfering BS with multiple transmit antennas transmitting in the spatial multiplexing (a.k.a. BLAST) mode, as well as an interfering BS with multiple transmit antennas employing a higher rank precoding. In such cases, each physical or virtual transmit antenna of the interfering BS can be regarded as a virtual interfering BS with a single transmit antenna transmitting (4). Then, since the codewords transmitted by these virtual BSs are independent, they can be separately decoded when the interference cancellation receiver is employed.

Let $Y_n$ and $V_n$ denote the $n$th, $1 \le n \le N$, columns of the matrices Y and V, with $Y_n^R$, $Y_n^I$ and $V_n^R$, $V_n^I$ denoting their real and imaginary parts, respectively. We define the 8N×1 vectors $\tilde{y} \triangleq [(Y_1^R)^T, (Y_1^I)^T, \ldots, (Y_N^R)^T, (Y_N^I)^T]^T$ and $\tilde{v} \triangleq [(V_1^R)^T, (V_1^I)^T, \ldots, (V_N^R)^T, (V_N^I)^T]^T$. Then $\tilde{y}$ can be written as

$\tilde{y} = \tilde{H}\tilde{x} + \tilde{v}, \qquad (6)$

where $\tilde{x} \triangleq [\tilde{x}_1^T, \ldots, \tilde{x}_K^T]^T$ with $\tilde{x}_k = [x_{k,1}^R, \ldots, x_{k,4}^R, x_{k,1}^I, \ldots, x_{k,4}^I]^T$ and $\tilde{H} = [\tilde{H}_1, \ldots, \tilde{H}_K] = [\tilde{h}_1, \ldots, \tilde{h}_{8K}]$. Further, when the $k$th BS employs either the quasi-orthogonal design or the Alamouti design, we can expand $\tilde{H}_k$ as

$\tilde{H}_k = [\tilde{h}_{8k-7}, \ldots, \tilde{h}_{8k}] = [(I_N \otimes C_1)\tilde{h}_{8k-7}, (I_N \otimes C_2)\tilde{h}_{8k-7}, \ldots, (I_N \otimes C_8)\tilde{h}_{8k-7}], \qquad (7)$

where $\otimes$ denotes the Kronecker product, $C_1 = I_8$ and

$\begin{matrix}{\begin{matrix}{C_{2} = I_{2} \otimes \begin{bmatrix}0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0\end{bmatrix}} & {C_{3} = I_{2} \otimes \begin{bmatrix}0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0\end{bmatrix}} & {C_{4} = I_{2} \otimes \begin{bmatrix}0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0\end{bmatrix}} \\ {C_{5} = J_{2} \otimes \begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1\end{bmatrix}} & {C_{6} = J_{2} \otimes \begin{bmatrix}0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix}} & {C_{7} = J_{2} \otimes \begin{bmatrix}0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0\end{bmatrix}} \\ {C_{8} = J_{2} \otimes \begin{bmatrix}0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0\end{bmatrix}} & {J_{2} = \begin{bmatrix}0 & -1 \\ 1 & 0\end{bmatrix},} & \;\end{matrix}} & (8)\end{matrix}$

with

$\begin{matrix}{\tilde{h}_{8k-7} = \begin{cases}{\rm vec}\left(\left[(H_{k}^{R})^{T}, (H_{k}^{I})^{T}\right]^{T}\right), & \text{for quasi-orthogonal}, \\ {\rm vec}\left(\left[(H_{k}^{R})^{T}, 0_{N \times 2}, (H_{k}^{I})^{T}, 0_{N \times 2}\right]^{T}\right), & \text{for Alamouti}.\end{cases}} & (9)\end{matrix}$

Finally, for a single transmit antenna BS, defining $\tilde{C}_i = (I_N \otimes C_i)$, we have that

$\begin{matrix}{\tilde{H}_{k} = [\tilde{h}_{8k-7}, \ldots, \tilde{h}_{8k}] = [\tilde{C}_{1}\tilde{h}_{8k-7}, -\tilde{C}_{2}\tilde{h}_{8k-7}, \tilde{C}_{3}\tilde{h}_{8k-7}, -\tilde{C}_{4}\tilde{h}_{8k-7}, \tilde{C}_{5}\tilde{h}_{8k-7}, \tilde{C}_{6}\tilde{h}_{8k-7}, \tilde{C}_{7}\tilde{h}_{8k-7}, \tilde{C}_{8}\tilde{h}_{8k-7}]} & (10)\end{matrix}$

and $\tilde{h}_{8k-7} = {\rm vec}([(H_k^R)^T, 0_{N \times 3}, (H_k^I)^T, 0_{N \times 3}]^T)$. Further, we let $\tilde{W} \triangleq {\rm diag}\{w_1, \ldots, w_K\} \otimes I_8$ and define

$\tilde{H}_{\bar{k}} \triangleq [\tilde{H}_{k+1}, \ldots, \tilde{H}_K], \qquad (11)$

$\tilde{W}_{\bar{k}} \triangleq {\rm diag}\{w_{k+1}, \ldots, w_K\} \otimes I_8. \qquad (12)$
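The real-valued model (6) with the expansion (7) can be checked numerically for a single quasi-orthogonal BS. The following sketch assumes N = 1 receive antenna, no noise, and hypothetical integer-valued channel and symbol values; the helper names are illustrative only. It builds the matrices of (8), forms the effective channel column by column as in (7), and compares the result with the complex model Y = XH of (1).

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

I2 = [[1, 0], [0, 1]]
J2 = [[0, -1], [1, 0]]
# 4x4 factors of C_1..C_8 as listed in (8).
A = {
    1: [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    2: [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]],
    3: [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    4: [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]],
    5: [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]],
    6: [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]],
    7: [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]],
    8: [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
}
# C_1..C_4 use I_2 as the left factor, C_5..C_8 use J_2.
C = {i: kron(I2 if i <= 4 else J2, A[i]) for i in range(1, 9)}

def stack_real_imag(z):
    """Real parts followed by imaginary parts, as in the definition of x-tilde."""
    return [v.real for v in z] + [v.imag for v in z]

def effective_channel(h):
    """8x8 effective channel of (7) for N = 1; column i is C_i times h-tilde."""
    h_tilde = stack_real_imag(h)
    cols = [matvec(C[i], h_tilde) for i in range(1, 9)]
    return [[cols[j][r] for j in range(8)] for r in range(8)]

def quasi_orthogonal(x):
    c = lambda z: z.conjugate()
    x1, x2, x3, x4 = x
    return [[x1, x2, x3, x4], [-c(x2), c(x1), -c(x4), c(x3)],
            [x3, x4, x1, x2], [-c(x4), c(x3), -c(x2), c(x1)]]

# Hypothetical channel (4 transmit antennas, 1 receive antenna) and symbols.
h = [1 + 1j, 2 - 1j, -1 + 3j, 0 - 2j]
x = [1 + 2j, 3 - 1j, -2 + 1j, 1 + 1j]

X = quasi_orthogonal(x)
y = [sum(X[t][a] * h[a] for a in range(4)) for t in range(4)]   # complex model (1)
y_tilde_direct = stack_real_imag(y)
y_tilde_model = matvec(effective_channel(h), stack_real_imag(x))  # real model (6)
```

Both routes give the same 8×1 real vector, which illustrates how the effective channel inherits the structure of the inner code through the fixed matrices $C_i$.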

2.2. Group Decoders

We consider the decoding of a frame received over T=4J, J≥1, consecutive symbol intervals, where over a block of 4 consecutive symbol intervals (or four consecutive tones in an OFDMA system) we obtain a model of the form in (6). We first consider the group MMSE decision-feedback decoder (GM-DFD), where the user decodes and cancels the signals of as many interfering BSs as necessary before decoding the desired signal. We then consider the group MMSE decoder (GMD), where the user only decodes the desired BS after suppressing the signals of all the interfering BSs.

2.2.1. Group MMSE Decision-Feedback Decoder (GM-DFD)

For ease of exposition, we assume that BS k is the desired one and that the BSs are decoded in the increasing order of their indices, i.e., BS 1 is decoded first, BS 2 is decoded second and so on. Note that no attempt is made to decode the signals of BSs k+1 to K.

The soft statistics for the first BS over 4 consecutive symbol intervals, denoted by $\tilde{r}_1$, are obtained as

$\tilde{r}_1 = \tilde{F}_1\tilde{y} = \tilde{F}_1\tilde{H}_1\tilde{x}_1 + \tilde{u}_1, \qquad (13)$

where $\tilde{F}_1$ denotes the MMSE filter for BS 1 and is given by $\tilde{F}_1 = \tilde{H}_1^T(\sigma^2 I + \tilde{H}_{\bar{1}}\tilde{W}_{\bar{1}}\tilde{H}_{\bar{1}}^T)^{-1}$, and $\tilde{u}_1 = \tilde{F}_1\tilde{H}_{\bar{1}}\tilde{x}_{\bar{1}} + \tilde{F}_1\tilde{v}$, where $\tilde{x}_{\bar{1}} \triangleq [\tilde{x}_2^T, \ldots, \tilde{x}_K^T]^T$ collects the symbols of the remaining BSs. Note that

$\begin{matrix}{\tilde{\Sigma}_{1} \triangleq E\left[\tilde{u}_{1}\tilde{u}_{1}^{T}\right] = \tilde{F}_{1}\tilde{H}_{1} = \tilde{H}_{1}^{T}\left(\sigma^{2}I + \tilde{H}_{\bar{1}}\tilde{W}_{\bar{1}}\tilde{H}_{\bar{1}}^{T}\right)^{-1}\tilde{H}_{1}.} & (14)\end{matrix}$

To decode BS 1, $\tilde{u}_1$ is assumed to be a colored Gaussian noise vector with the covariance in (14). Under this assumption, in the case when no outer code is employed by BS 1, the decoder obtains a hard decision $\hat{x}_1$ using the maximum-likelihood (ML) rule over the model in (13). On the other hand, if an outer code is employed by BS 1, soft outputs for each coded bit in $\tilde{x}_1$ are obtained using the soft-output MIMO demodulator over the model in (13), which are then fed to a decoder. The decoded codeword is re-encoded and modulated to obtain the decision vectors $\{\hat{x}_1\}$ over the frame of duration 4J symbol intervals. In either case, the decision vectors $\{\hat{x}_1\}$ are fed back before decoding the subsequent BSs. In particular, the soft statistics for the desired $k$th BS are obtained as

$\begin{matrix}{\tilde{r}_{k} = \tilde{F}_{k}\left(\tilde{y} - \sum\limits_{j = 1}^{k - 1}\tilde{H}_{j}\hat{x}_{j}\right),} & (15)\end{matrix}$

where $\tilde{F}_k$ denotes the MMSE filter for BS k and is given by $\tilde{F}_k = \tilde{H}_k^T(\sigma^2 I + \tilde{H}_{\bar{k}}\tilde{W}_{\bar{k}}\tilde{H}_{\bar{k}}^T)^{-1}$. The decoder for BS k is restricted to be a function of $\{\tilde{r}_k\}$ and obtains the decisions $\{\hat{x}_k\}$ in a similar manner, after assuming perfect feedback and assuming the additive noise plus interference to be Gaussian. Note that the choice of decoding BSs 1 to k−1 prior to BS k was arbitrary. In the sequel we will address the issue of choosing an appropriate ordered subset of interferers to decode prior to the desired signal.

2.2.2. Group MMSE Decoder (GMD)

We assume that BS 1 is the desired one, so that only BS 1 is decoded after suppressing the interference from BSs 2 to K. The soft statistics for the desired BS are exactly $\tilde{r}_1$ given in (13). Note that the MMSE filter for BS 1 can be written as $\tilde{F}_1 = \tilde{H}_1^T(\tilde{R}_{\bar{1}})^{-1}$, where $\tilde{R}_{\bar{1}} = \sigma^2 I + \tilde{H}_{\bar{1}}\tilde{W}_{\bar{1}}\tilde{H}_{\bar{1}}^T$ denotes the covariance matrix of the noise plus interference. Thus, to implement this decoder we only need estimates of the channel matrix corresponding to the desired signal and of the covariance matrix. Also, the user need not be aware of the inner code employed by any of the interfering BSs. In this work we assume perfect estimation of the channel as well as the covariance matrices.

Inspecting the models in (13) and (15), we see that the complexity of implementing the ML detection (demodulation) for the $k$th BS (under the assumption of perfect feedback in the case of GM-DFD) directly depends on the structure of the matrix $\tilde{F}_k\tilde{H}_k$. Ideally, the matrix $\tilde{F}_k\tilde{H}_k$ should be diagonal, which results in a linear complexity, and if most of the off-diagonal elements of $\tilde{F}_k\tilde{H}_k$ are zero, then the cost of implementing the detector (demodulator) is significantly reduced. Henceforth, for notational convenience we will absorb the matrix $\tilde{W}$ in the matrix $\tilde{H}$, i.e., we will denote the matrix $\tilde{H}\tilde{W}$ by $\tilde{H}$.

3. Decoupling Property

In this section we prove a property which results in significantly lower demodulation complexity. Note that the matrices defined in (8) have the following properties:

$C_l^T = C_l,\ l \in \{1, 3\}, \quad C_l^T = -C_l,\ l \in \{1, \ldots, 8\} \setminus \{1, 3\}, \quad C_l^T C_l = I,\ \forall\, l. \qquad (16)$

In addition they also satisfy the ones given in Table I, shown below,

TABLE I
PROPERTIES OF {C_i}

            C₁     C₂     C₃     C₄     C₅     C₆     C₇     C₈
C₁^(T)      C₁     C₂     C₃     C₄     C₅     C₆     C₇     C₈
C₂^(T)     −C₂     C₁    −C₄     C₃     C₆    −C₅     C₈    −C₇
C₃^(T)      C₃     C₄     C₁     C₂     C₇     C₈     C₅     C₆
C₄^(T)     −C₄     C₃    −C₂     C₁     C₈    −C₇     C₆    −C₅
C₅^(T)     −C₅    −C₆    −C₇    −C₈     C₁     C₂     C₃     C₄
C₆^(T)     −C₆     C₅    −C₈     C₇    −C₂     C₁    −C₄     C₃
C₇^(T)     −C₇    −C₈    −C₅    −C₆     C₃     C₄     C₁     C₂
C₈^(T)     −C₈     C₇    −C₆     C₅    −C₄     C₃    −C₂     C₁

where the matrix in the (i, j)th position is obtained as the result of $C_i^T C_j$. Thus, the set of matrices $\cup_{i=1}^{8}\{\pm C_i\}$ is closed under matrix multiplication and the transpose operation. We offer the following theorem.
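As a brief aside before the theorem, the closure property summarized in Table I can be checked mechanically. The sketch below rebuilds the matrices of (8), forms every product $C_i^T C_j$, and records which signed $C_l$ it equals; the helper names and the signed-index encoding (for example, −4 standing for $-C_4$) are conventions assumed here for illustration.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def negate(A):
    return [[-a for a in row] for row in A]

I2 = [[1, 0], [0, 1]]
J2 = [[0, -1], [1, 0]]
# 4x4 factors of C_1..C_8 as listed in (8).
A = {
    1: [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    2: [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]],
    3: [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    4: [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]],
    5: [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]],
    6: [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]],
    7: [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]],
    8: [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
}
C = {i: kron(I2 if i <= 4 else J2, A[i]) for i in range(1, 9)}

def classify(M):
    """Return l if M == C_l, -l if M == -C_l, and None if M is neither."""
    for l in range(1, 9):
        if M == C[l]:
            return l
        if M == negate(C[l]):
            return -l
    return None

# table[(i, j)] encodes the product C_i^T C_j as a signed index.
table = {(i, j): classify(matmul(transpose(C[i]), C[j]))
         for i in range(1, 9) for j in range(1, 9)}

# Symmetry pattern of (16): +1 for symmetric, -1 for skew-symmetric.
symmetry = {l: (1 if transpose(C[l]) == C[l] else -1) for l in range(1, 9)}
```

Every entry of `table` is a signed index rather than `None`, which is the closure statement; printing the dictionary row by row reproduces the layout of Table I.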

Theorem 1. Consider the decoding of the $k$th BS. We have that

$\tilde{H}_k^T(\sigma^2 I + \tilde{H}_{\bar{k}}\tilde{H}_{\bar{k}}^T)^{-1}\tilde{H}_k = \alpha_k C_1 + \beta_k C_3, \qquad (17)$

for some real-valued scalars $\alpha_k$, $\beta_k$. Note that $\alpha_k$, $\beta_k$ depend on $\tilde{H}_k$ and $\tilde{H}_{\bar{k}}$, but for notational convenience we do not explicitly indicate the dependence.

Proof. To prove the theorem, without loss of generality we will only consider decoding of the first BS. We first note that

$\begin{matrix}{\sigma^{2}I + \tilde{H}_{\bar{1}}\tilde{H}_{\bar{1}}^{T} = \sum\limits_{i = 1}^{8}\left(I_{N} \otimes C_{i}\right)\tilde{A}\left(I_{N} \otimes C_{i}^{T}\right), \quad {\rm where} \quad \tilde{A} \triangleq \frac{\sigma^{2}}{8}I + \sum\limits_{k = 2}^{K}\tilde{h}_{8k - 7}\tilde{h}_{8k - 7}^{T}.} & (18)\end{matrix}$

Let $\tilde{B} \triangleq (\sigma^2 I + \tilde{H}_{\bar{1}}\tilde{H}_{\bar{1}}^T)^{-1}$ and note that $\tilde{B} \succ 0$. Using the properties of the matrices $\{C_i\}$ in (16) and Table I, it is readily verified that, for each $1 \le i \le 8$,

$\left(I_{N} \otimes C_{i}\right)\tilde{B}\left(I_{N} \otimes C_{i}^{T}\right) = \left(\left(I_{N} \otimes C_{i}\right)\left(\sum\limits_{j = 1}^{8}\left(I_{N} \otimes C_{j}\right)\tilde{A}\left(I_{N} \otimes C_{j}^{T}\right)\right)\left(I_{N} \otimes C_{i}^{T}\right)\right)^{- 1} = \tilde{B}.$

As a consequence we can expand $\tilde{B}$ as

$\begin{matrix}{\tilde{B} = \sum\limits_{i = 1}^{8}\left(I_{N} \otimes C_{i}\right)\left(\tilde{B}/8\right)\left(I_{N} \otimes C_{i}^{T}\right).} & (19)\end{matrix}$

Next, invoking the properties of the matrices $\{C_i\}$ and using the fact that $\tilde{B} = \tilde{B}^T$, it can be seen that the matrix

$\left(I_{N} \otimes C_{k}^{T}\right)\left(\sum\limits_{i = 1}^{8}\left(I_{N} \otimes C_{i}\right)\left(\tilde{B}/8\right)\left(I_{N} \otimes C_{i}^{T}\right)\right)\left(I_{N} \otimes C_{j}\right),$

where $1 \le k, j \le 8$, is identical to $\tilde{B}$ when k=j, is symmetric when (k, j) or (j, k) ∈ {(1, 3), (2, 4), (5, 7), (6, 8)}, and is skew-symmetric otherwise. Since the quadratic form of a skew-symmetric matrix vanishes, the desired property in (17) directly follows from these facts.
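The identity (17) can also be checked on a small numerical instance. The sketch below is an illustration under stated assumptions, not the receiver implementation: it takes N = 1, one desired and one interfering BS that both use the quasi-orthogonal design, σ² = 1, and hypothetical integer channel vectors. Exact rational arithmetic from the standard `fractions` module is used so that the comparison is free of rounding error, and the Gauss-Jordan routine is a generic inverse that does not exploit the structure discussed in Section 4.

```python
from fractions import Fraction

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse(M):
    """Exact Gauss-Jordan inverse over the rationals (generic, unstructured)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

I2 = [[1, 0], [0, 1]]
J2 = [[0, -1], [1, 0]]
A = {  # 4x4 factors of C_1..C_8 as listed in (8)
    1: [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    2: [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]],
    3: [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
    4: [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]],
    5: [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]],
    6: [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]],
    7: [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]],
    8: [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
}
C = {i: kron(I2 if i <= 4 else J2, A[i]) for i in range(1, 9)}

def effective_channel(h_tilde):
    """8x8 effective channel of (7) for N = 1; column i is C_i times h-tilde."""
    cols = [[sum(c * v for c, v in zip(row, h_tilde)) for row in C[i]]
            for i in range(1, 9)]
    return transpose(cols)

# Hypothetical h-tilde vectors (real parts, then imaginary parts).
h_desired = [1, 2, -1, 0, 1, -1, 3, -2]
h_interf = [2, 0, 1, -1, -1, 1, 0, 2]
sigma2 = 1

H1 = effective_channel(h_desired)
H2 = effective_channel(h_interf)
R = matmul(H2, transpose(H2))           # interference part of the covariance
for d in range(8):
    R[d][d] += sigma2                   # add the noise term sigma^2 I
B = inverse(R)
M = matmul(matmul(transpose(H1), B), H1)   # left-hand side of (17)

alpha, beta = M[0][0], M[0][2]
expected = [[alpha * C[1][r][c] + beta * C[3][r][c] for c in range(8)]
            for r in range(8)]          # right-hand side of (17)
```

In this instance `M` equals `expected` entry by entry, so only the diagonal and the positions coupling symbols (1, 3), (2, 4), (5, 7) and (6, 8) are nonzero, in line with the decoupling argument of the proof.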

Note that Theorem 1 guarantees the quasi-orthogonality property even after interference suppression. In particular, the important point which can be inferred from Theorem 1 is that the joint detection (demodulation) of four complex QAM symbols (or eight PAM symbols) is split into four smaller joint detection (demodulation) problems involving a pair of PAM symbols each. Thus, with four M-QAM complex symbols the complexity is reduced from $\mathcal{O}(M^4)$ to $\mathcal{O}(M)$. Furthermore, specializing Theorem 1 to the case when the desired BS (say BS k) employs the quasi-orthogonal design and there are no interferers, we see that

$\tilde{H}_k^T\tilde{H}_k = \alpha_k C_1 + \beta_k C_3. \qquad (20)$

(20) implies that the maximum likelihood decoding complexity of the quasi-orthogonal design is $\mathcal{O}(M)$ instead of the more pessimistic $\mathcal{O}(M^2)$ claimed by the original contribution. We note that a different quasi-orthogonal design, referred to as the minimum decoding complexity quasi-orthogonal design, was proposed for a point-to-point MIMO system in the prior art, which was shown to have an ML decoding complexity of $\mathcal{O}(M)$.

Finally, it can be inferred from the sequel that $\beta_k = 0$ in (17) when no BS in {k, k+1, . . . , K} employs the quasi-orthogonal design.

4. Efficient Inverse Computation

In this section we utilize the structure of the covariance matrix $\tilde{R} \triangleq \sigma^2 I + \tilde{H}\tilde{H}^T$ to efficiently compute its inverse. Consequently, the complexity involved in computing the MMSE filters is significantly reduced. Let $\tilde{S} = \tilde{R}^{-1}$. From (18) and (19), it follows that we can expand both $\tilde{R}$ and $\tilde{S}$ as

$\begin{matrix}{\tilde{R} = \begin{bmatrix}\sum\limits_{i = 1}^{8}C_{i}P_{11}C_{i}^{T} & \cdots & \sum\limits_{i = 1}^{8}C_{i}P_{1N}C_{i}^{T} \\ \vdots & \ddots & \vdots \\ \sum\limits_{i = 1}^{8}C_{i}P_{N1}C_{i}^{T} & \cdots & \sum\limits_{i = 1}^{8}C_{i}P_{NN}C_{i}^{T}\end{bmatrix}, \quad \tilde{S} = \begin{bmatrix}\sum\limits_{i = 1}^{8}C_{i}Q_{11}C_{i}^{T} & \cdots & \sum\limits_{i = 1}^{8}C_{i}Q_{1N}C_{i}^{T} \\ \vdots & \ddots & \vdots \\ \sum\limits_{i = 1}^{8}C_{i}Q_{N1}C_{i}^{T} & \cdots & \sum\limits_{i = 1}^{8}C_{i}Q_{NN}C_{i}^{T}\end{bmatrix},} & (21)\end{matrix}$

where $\{P_{ij}, Q_{ij}\}_{i,j=1}^{N}$ are 8×8 matrices such that

$P_{ji} = P_{ij}^T, \quad Q_{ji} = Q_{ij}^T, \quad 1 \le i, j \le N. \qquad (22)$

The inverse $\tilde{S}$ can be computed recursively, starting from the bottom-right sub-matrix of $\tilde{R}$, using the following inverse formula for block partitioned matrices:

$\begin{matrix}{\begin{bmatrix}E & F \\ G & H\end{bmatrix}^{- 1} = \begin{bmatrix}\left(E - FH^{- 1}G\right)^{- 1} & -\left(E - FH^{- 1}G\right)^{- 1}FH^{- 1} \\ -H^{- 1}G\left(E - FH^{- 1}G\right)^{- 1} & H^{- 1} + H^{- 1}G\left(E - FH^{- 1}G\right)^{- 1}FH^{- 1}\end{bmatrix}.} & (23)\end{matrix}$

The following properties ensure that the computations involved in determining $\tilde{S}$ are dramatically reduced. First, note that the 8×8 sub-matrices in (21) belong to the set of matrices

$\begin{matrix}{\mathcal{P} \triangleq \left\{\sum\limits_{i = 1}^{8}C_{i}AC_{i}^{T} : A \in {\mathbb{R}}^{8 \times 8}\right\}.} & (24)\end{matrix}$

It is evident that $\mathcal{P}$ is closed under the transpose operation. Utilizing the structure of the matrices $\{C_i\}$ in (8), after some algebra it can be shown that the set $\mathcal{P}$ can also be written as

$\begin{matrix}{\mathcal{P} = \left\{\sum\limits_{i = 1}^{8}b_{i}S_{i} : \left[b_{1}, \ldots, b_{8}\right]^{T} \in {\mathbb{R}}^{8}\right\},} & (25)\end{matrix}$

where $S_1 = I_8$, $S_5 = J_2 \otimes I_4$, $S_3 = C_3$ and

$\begin{matrix}{\begin{matrix}{S_{2} = \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \otimes \begin{bmatrix}0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0\end{bmatrix}} & {S_{4} = \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \otimes \begin{bmatrix}0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0\end{bmatrix}} \\ {S_{6} = \begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix} \otimes \begin{bmatrix}0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0\end{bmatrix}} & {S_{7} = J_{2} \otimes \begin{bmatrix}0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0\end{bmatrix}} \\ {S_{8} = \begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix} \otimes \begin{bmatrix}0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0\end{bmatrix}.} & \;\end{matrix}} & (26)\end{matrix}$

It is readily seen that the set in (25) is a matrix group under matrix addition. Note that any matrix $B$ in this set is parametrized by eight scalars. The matrices $\{S_i\}$ have the following properties:

$\begin{matrix}{S_l^T = S_l, \; l \in \{1, 3\}, \quad S_l^T = -S_l, \; l \in \{1, \ldots, 8\} \setminus \{1, 3\}, \quad S_l^T S_l = I, \; \forall\, l,} & (27)\end{matrix}$

in addition to the ones given in Table II, shown below.

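TABLE II
PROPERTIES OF $\{S_i\}$

$\begin{array}{c|cccccccc} & S_1 & S_2 & S_3 & S_4 & S_5 & S_6 & S_7 & S_8 \\ \hline S_1^T & S_1 & S_2 & S_3 & S_4 & S_5 & S_6 & S_7 & S_8 \\ S_2^T & -S_2 & S_1 & -S_4 & S_3 & -S_6 & S_5 & S_8 & -S_7 \\ S_3^T & S_3 & S_4 & S_1 & S_2 & S_7 & -S_8 & S_5 & -S_6 \\ S_4^T & -S_4 & S_3 & -S_2 & S_1 & S_8 & S_7 & -S_6 & -S_5 \\ S_5^T & -S_5 & S_6 & -S_7 & -S_8 & S_1 & -S_2 & S_3 & S_4 \\ S_6^T & -S_6 & -S_5 & S_8 & -S_7 & S_2 & S_1 & S_4 & -S_3 \\ S_7^T & -S_7 & -S_8 & -S_5 & S_6 & S_3 & -S_4 & S_1 & S_2 \\ S_8^T & -S_8 & S_7 & S_6 & S_5 & -S_4 & -S_3 & -S_2 & S_1\end{array}$

Using these properties it can be verified that the set $\{\pm S_i\}_{i=1}^{8}$ is closed under matrix multiplication and the transpose operation. The following lemma provides useful properties of the set in (24).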

Lemma 1.

$\begin{matrix}{A, B \text{ in the set (24)} \;\Rightarrow\; AB \text{ in the set (24)}} & (28) \\ {A = A^T \text{ in the set (24)} \;\Leftrightarrow\; A = a_1 I_8 + a_2 S_3 = a_1 I_8 + a_2 C_3} & (29) \\ {A = a_1 I_8 + a_2 S_3 \;\&\; |A| \neq 0 \;\Rightarrow\; A^{-1} = \frac{a_1}{a_1^2 - a_2^2} I_8 - \frac{a_2}{a_1^2 - a_2^2} S_3} & (30)\end{matrix}$
for some scalars $a_1, a_2$, and

$\begin{matrix}{\sum\limits_{i=1}^{8} C_i B C_i^T = b_1 I_8 + b_2 S_3 = b_1 I_8 + b_2 C_3, \quad \forall\, B = B^T \in \mathbb{R}^{8 \times 8}} & (31) \\ {Q \text{ in the set (24)} \;\Rightarrow\; QQ^T = q_1 I_8 + q_2 C_3} & (32)\end{matrix}$
for some scalars $b_1, b_2, q_1, q_2$.

Proof. The facts in (28) and (29) follow directly by using the alternate form (25) of the set along with the properties of $\{S_i\}$. (30) follows after some simple algebra, whereas (31) follows from (29) upon using the definition of the set in (24). Finally, (32) follows from (28) and (29) after recalling that the set is closed under the transpose operation.

Thus for any $A, B$ in the set (24), the entire 8×8 matrix $AB$ can be determined by computing only one of its rows (or columns). The set in (24) is not a matrix group under matrix multiplication since it contains singular matrices. However, the set of all non-singular matrices in it forms a matrix group, as shown by the following lemma.

Lemma 2. If $A$ belongs to the set in (24) and $|A| \neq 0$, then $A^{-1}$ also belongs to the set in (24). The set of all non-singular matrices in the set (24) forms a matrix group under matrix multiplication and is given by

$\begin{matrix}{\left\{ \sum\limits_{i=1}^{8} b_i S_i : [b_1, \ldots, b_8]^T \in \mathbb{R}^{8} \;\&\; \sum\limits_{i=1}^{8} b_i^2 \neq \pm 2\left(b_1 b_3 + b_2 b_4 + b_5 b_7 - b_6 b_8\right) \right\}.} & (33)\end{matrix}$

Proof. Consider any non-singular $A$ in the set (24), so that $A^{-1}$ exists. We can use the definition in (24) to expand $A$ as $\sum_{j=1}^{8} C_j Q C_j^T$ for some $Q \in \mathbb{R}^{8 \times 8}$. Consequently,

$A^{-1} = \left( \sum\limits_{j=1}^{8} C_j Q C_j^T \right)^{-1}.$

Next, as done in the proof of Theorem 1, using the properties of $\{C_i\}$ we can show that

$C_i A^{-1} C_i^T = \left( C_i \left( \sum\limits_{j=1}^{8} C_j Q C_j^T \right) C_i^T \right)^{-1} = A^{-1}.$

Thus, we have that

$\begin{matrix}{A^{-1} = \sum\limits_{j=1}^{8} C_j \left( A^{-1}/8 \right) C_j^T,} & (34)\end{matrix}$

so that $A^{-1}$ belongs to the set in (24). Next, using the alternate form in (25), we must have that

$A = \sum\limits_{i=1}^{8} a_i S_i,$

for some $\{a_i\}$. Since the non-singular $A$ belongs to the set (24), $AA^T$ belongs to it as well, and note that

$\begin{matrix}{|A| \neq 0 \;\Leftrightarrow\; |AA^T| > 0.} & (35)\end{matrix}$

Invoking the property in (32), after some algebra we see that

$\begin{matrix}{AA^T = \sum\limits_{i=1}^{8} a_i^2\, I_8 + 2\left(a_1 a_3 + a_2 a_4 + a_5 a_7 - a_6 a_8\right) C_3.} & (36)\end{matrix}$

Then it can be verified that

$\begin{matrix}{|AA^T| = \left( \left( \sum\limits_{i=1}^{8} a_i^2 \right)^2 - 4\left(a_1 a_3 + a_2 a_4 + a_5 a_7 - a_6 a_8\right)^2 \right)^4.} & (37)\end{matrix}$

From (35) and (37), we see that the set in (33) is precisely the set of all non-singular matrices in the set (24). Since this set includes the identity matrix and is closed under matrix multiplication and inversion, it is a matrix group under matrix multiplication.
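As an illustrative check of the closed-form inverse in (30) and of the expressions in (36) and (37), the following sketch may be helpful. It assumes that $C_3 = I_2 \otimes P$ with $P$ the 4×4 matrix recited in the claims, and it restricts attention to the special case in which only the coefficients of $I_8$ and $S_3 = C_3$ are non-zero. The helper names (`kron`, `matmul`, `combine`, `det`) and the numeric values are hypothetical and chosen only for illustration; note that the variable `a2` plays the role of the coefficient called $a_3$ in (36).

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def combine(c1, m1, c2, m2):
    """Return c1*m1 + c2*m2 entry by entry."""
    return [[c1 * x + c2 * y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, d = len(m), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return d

I8 = [[int(r == c) for c in range(8)] for r in range(8)]
# Assumption: C3 = I2 (x) P, with P as recited in the claims.
C3 = kron([[1, 0], [0, 1]],
          [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]])

a1, a2 = Fraction(3), Fraction(1)           # illustrative values with a1^2 != a2^2
A = combine(a1, I8, a2, C3)                 # A = a1*I8 + a2*S3
d = a1 ** 2 - a2 ** 2
A_inv = combine(a1 / d, I8, -a2 / d, C3)    # closed form (30)

AAT = matmul(A, [list(col) for col in zip(*A)])
expected_AAT = combine(a1 ** 2 + a2 ** 2, I8, 2 * a1 * a2, C3)          # (36)
expected_det = ((a1 ** 2 + a2 ** 2) ** 2 - 4 * (a1 * a2) ** 2) ** 4     # (37)
```

In this special case only two scalars need to be computed to obtain the inverse, which is the kind of saving the lemmas are meant to provide.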

Lemma 2 is helpful in computing the inverses of the principal sub-matrices of $\tilde{R}$. Note that since $\tilde{R} > 0$, all its principal sub-matrices are also positive-definite and hence non-singular. Then, to compute the inverse of any such $A$ in the set (24), we can use Lemma 2 to conclude that $A^{-1}$ is also in the set (24), so that we need to determine only the eight scalars which parametrize $A^{-1}$.

As mentioned before, in this work we assume that a perfect estimate of the covariance matrix $\tilde{R}$ is available. In practice the covariance matrix $\tilde{R}$ must be estimated from the received samples. We have observed that the Ledoit and Wolf (LW) estimator [10] works well in practice. For completeness we provide the LW estimator. Let $\{\tilde{y}_n\}_{n=1}^{S}$ be the $S$ vectors which are obtained from samples received over $4S$ consecutive symbol intervals over which the effective channel matrix $\tilde{H}$ in (6) is constant. These samples could also be received over consecutive tones and symbols in an OFDMA system. Then the LW estimate $\hat{\tilde{R}}$ is given by

$\begin{matrix}{\hat{\tilde{R}} = (1 - \rho)\hat{Q} + \mu\rho I,} & (38)\end{matrix}$

where

$\hat{Q} = \frac{1}{S}\sum\limits_{n=1}^{S} \tilde{y}_n \tilde{y}_n^T$

and

$\begin{matrix}{\rho = \min\left\{ \frac{\sum\limits_{n=1}^{S} \left\| \tilde{y}_n \tilde{y}_n^T - \hat{Q} \right\|_F^2}{S^2 \left\| \hat{Q} - \mu I \right\|_F^2},\, 1 \right\} \quad \text{and} \quad \mu = \frac{\mathrm{tr}(\hat{Q})}{8N}.} & (39)\end{matrix}$

5. GM-DFD: Decoding Order

It is well known that the performance of decision feedback decoders is strongly dependent on the order of decoding. Here, however, we are only concerned with the error probability obtained for the signal of the desired (serving) BS. Note that the GM-DFD results in identical performance for the desired BS for any two decoding orders where the ordered sets of BSs decoded prior to the desired one, respectively, are identical. Using this observation, we see that the optimal, albeit brute-force, method to decode the signal of the desired BS using the GM-DFD would be to sequentially examine

$\sum\limits_{i=0}^{K-1} i! \begin{pmatrix} K-1 \\ i \end{pmatrix}$

possible decoding orders, where the ordered sets of BSs decoded prior to the desired one are distinct for any two decoding orders, and pick the first one where the signal of the desired BS is correctly decoded, which in practice can be determined via a cyclic redundancy check (CRC). Although the optimal method does not examine all $K!$ possible decoding orders, it can be prohibitively complex. We propose a process which determines the BSs (along with the corresponding decoding order) that must be decoded before the desired one. The remaining BSs are not decoded.
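To give a feel for how quickly the brute-force search grows, the count of candidate decoding orders stated above can be evaluated directly. The sketch below is only an illustration; the function name `num_candidate_orders` is hypothetical and not part of the described method.

```python
from math import comb, factorial

def num_candidate_orders(K: int) -> int:
    """Number of distinct ordered sets of BSs that may precede the desired BS.

    For each i, choose which i of the other K-1 BSs are decoded first
    (comb(K-1, i) ways) and in which order (i! ways).
    """
    return sum(factorial(i) * comb(K - 1, i) for i in range(K))

if __name__ == "__main__":
    for K in range(1, 7):
        # Compare against the K! orders of an exhaustive search over all BSs.
        print(K, num_candidate_orders(K), factorial(K))
```

For example, with K = 4 the sum evaluates to 16 candidate orders, compared with 4! = 24 orderings of all four BSs.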

The challenge in designing such a process is that while canceling a correctly decoded interferer clearly aids the decoding of the desired signal, the subtraction of even one erroneously decoded signal can result in a decoding error for the desired signal. Before providing the process we need to establish some notation. We let $\mathcal{K} = \{1, \ldots, K\}$ denote the set of BSs and let $k$ denote the index of the desired BS. Let $R_j$, $1 \le j \le K$, denote the rate (in bits per channel use) at which the BS $j$ transmits. Also, we let $\pi$ denote any ordered subset of $\mathcal{K}$ having $k$ as its last element. For a given $\pi$, we let $\pi(1)$ denote its first element, which is also the index of the BS decoded first by the GM-DFD, $\pi(2)$ denote its second element, which is also the index of the BS decoded second by the GM-DFD, and so on. Finally, let $|\pi|$ denote the cardinality of $\pi$ and let $\underline{Q}$ denote the set of all possible such $\pi$.

Let us define $m(\tilde{H}, j, S)$ to be a metric whose value is proportional to the chance of successful decoding of BS $j$ in the presence of interference from BSs in the set $S$. A large value of the metric implies a high chance of successfully decoding BS $j$. Further, we adopt the convention that $m(\tilde{H}, \phi, S) = \infty$, $\forall S$, since no error is possible in decoding the empty set. Define $\tilde{H}_S = [\tilde{H}_j]_{j \in S}$. Let $I(\tilde{H}, j, S)$ denote an achievable rate (in bits per channel use) obtained post MMSE filtering for BS $j$ in the presence of interference from BSs in the set $S$ and note that

$\begin{matrix}{\begin{aligned} I(\tilde{H}, j, S) &= \frac{1}{2} \log \left| I_8 + \tilde{H}_j^T \left( \sigma^2 I + \tilde{H}_S \tilde{H}_S^T \right)^{-1} \tilde{H}_j \right| \\ &= 2 \log \left( \left(1 + \alpha_{j,S}\right)^2 - \beta_{j,S}^2 \right), \end{aligned}} & (40)\end{matrix}$
where the second equality follows upon using (17). In this work we suggest the following three examples for $m(\tilde{H}, j, S)$:

$\begin{matrix}{m(\tilde{H}, j, S) = I(\tilde{H}, j, S) - R_j,} & (41) \\ {m(\tilde{H}, j, S) = I(\tilde{H}, j, S) / R_j, \quad \text{and}} & (42) \\ {\begin{aligned} m(\tilde{H}, j, S) &= \max\limits_{\rho \in [0,1]} \rho \left( \frac{1}{2} \log \left| I_8 + \frac{1}{1 + \rho} \tilde{H}_j^T \left( \sigma^2 I + \tilde{H}_S \tilde{H}_S^T \right)^{-1} \tilde{H}_j \right| - R_j \right) \\ &= \max\limits_{\rho \in [0,1]} \rho \left( 2 \log \left( \left( 1 + \frac{\alpha_{j,S}}{1 + \rho} \right)^2 - \frac{\beta_{j,S}^2}{(1 + \rho)^2} \right) - R_j \right). \end{aligned}} & (43)\end{matrix}$
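The three metrics in (41)-(43) can be evaluated from the two scalars $\alpha_{j,S}$ and $\beta_{j,S}$ alone, using the second forms of (40) and (43). The following is a minimal sketch under stated assumptions: the logarithm is taken to base 2 (rates being in bits), $(1+\alpha)^2 > \beta^2$ holds, and the maximization over $\rho$ in (43) is approximated by a simple grid search. The function names and the grid resolution are hypothetical.

```python
import math

def achievable_rate(alpha: float, beta: float) -> float:
    """Second form of (40); base-2 logarithm is an assumption."""
    return 2.0 * math.log2((1.0 + alpha) ** 2 - beta ** 2)

def metric_margin(alpha: float, beta: float, rate: float) -> float:
    """Metric (41): achievable rate minus the transmission rate."""
    return achievable_rate(alpha, beta) - rate

def metric_ratio(alpha: float, beta: float, rate: float) -> float:
    """Metric (42): achievable rate divided by the transmission rate."""
    return achievable_rate(alpha, beta) / rate

def metric_exponent(alpha: float, beta: float, rate: float, grid: int = 1000) -> float:
    """Metric (43), with the maximum over rho in [0, 1] taken on a uniform grid."""
    best = 0.0  # value at rho = 0
    for n in range(1, grid + 1):
        rho = n / grid
        inner = 2.0 * math.log2((1.0 + alpha / (1.0 + rho)) ** 2
                                - (beta / (1.0 + rho)) ** 2) - rate
        best = max(best, rho * inner)
    return best

if __name__ == "__main__":
    for rate in (2.0, 5.0):
        print(rate,
              metric_margin(1.0, 0.0, rate),
              metric_ratio(1.0, 0.0, rate),
              metric_exponent(1.0, 0.0, rate))
```

With $\alpha = 1$ and $\beta = 0$ the achievable rate is 4, so a transmission rate of 5 gives a negative value for (41), a value below 1 for (42) and a value of 0 for (43), whereas a rate of 2 gives positive values for (41) and (43) and a value above 1 for (42).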

Note that the metric in (43) is the Gaussian random coding error exponent obtained after assuming BSs in the set $S$ to be Gaussian interferers. All three metrics are applicable to general non-symmetric systems where the BSs may transmit at different rates. It can be readily verified that all three metrics given above also satisfy the following simple fact:

$\begin{matrix}{m(\tilde{H}, j, S) \ge m(\tilde{H}, j, R), \quad \forall\, S \subset R \subset \mathcal{K}.} & (44)\end{matrix}$

Now, for a given $\pi \in \underline{Q}$, the metric $m(\tilde{H}, k, \mathcal{K} \setminus \cup_{j=1}^{|\pi|} \pi(j))$ indicates the decoding reliability of the desired signal assuming perfect feedback from previously decoded signals, whereas $\min_{1 \le j \le |\pi| - 1} m(\tilde{H}, \pi(j), \mathcal{K} \setminus \cup_{i=1}^{j} \pi(i))$ can be used to measure the quality of the fed-back decisions. Thus a sensible metric to select $\pi$ is

$\begin{matrix}{f(\tilde{H}, \pi) \overset{\Delta}{=} \min\limits_{1 \le j \le |\pi|} m\left(\tilde{H}, \pi(j), \mathcal{K} \setminus \bigcup\limits_{i=1}^{j} \pi(i)\right).} & (45)\end{matrix}$

We are now ready to present our process.

1. Initialize: $S = \{1, \ldots, K\}$ and $\hat{\pi} = \phi$.
2. Among all BS indices $j \in S$, select the one having the highest value of the metric $m(\tilde{H}, j, S \setminus j)$ and denote it by $\hat{j}$.
3. Update $S = S \setminus \hat{j}$ and $\hat{\pi} = \{\hat{\pi}, \hat{j}\}$.
4. If $\hat{j} = k$ then stop, else go to Step 2.

The proposed greedy process is optimal in the following sense.

Theorem 2. The process has the following optimality:

$\begin{matrix}{\hat{\pi} = \arg\max\limits_{\pi \in \underline{Q}} f(\tilde{H}, \pi).} & (46)\end{matrix}$

Proof. Let $\pi^{(i)}$ be any other valid ordered partition in $\underline{Q}$ such that its first $i$ elements are identical to those of $\hat{\pi}$. Construct another ordered partition $\pi^{(i+1)}$ as follows:

$\begin{matrix}{\begin{aligned} &\pi^{(i+1)}(j) = \pi^{(i)}(j) = \hat{\pi}(j), \quad 1 \le j \le i, \\ &\pi^{(i+1)}(i+1) = \hat{\pi}(i+1), \\ &\pi^{(i+1)}(j+1) = \pi^{(i)}(j) \setminus \hat{\pi}(i+1), \quad i+1 \le j \le |\pi^{(i)}| \;\&\; \hat{\pi}(i+1) \neq k. \end{aligned}} & (47)\end{matrix}$

Note that $\pi^{(i+1)} \in \underline{Q}$. Now, to prove optimality it is enough to show that

$\begin{matrix}{f(\tilde{H}, \pi^{(i+1)}) \ge f(\tilde{H}, \pi^{(i)}).} & (48)\end{matrix}$

To show (48) we first note that

$\begin{matrix}{m\left(\tilde{H}, \pi^{(i+1)}(j), \mathcal{K} \setminus \cup_{q=1}^{j} \pi^{(i+1)}(q)\right) = m\left(\tilde{H}, \pi^{(i)}(j), \mathcal{K} \setminus \cup_{q=1}^{j} \pi^{(i)}(q)\right), \quad 1 \le j \le i.} & (49)\end{matrix}$

Since the greedy process selects the element (BS) with the highest metric at any stage, we have that

$\begin{matrix}{m\left(\tilde{H}, \pi^{(i+1)}(i+1), \mathcal{K} \setminus \cup_{q=1}^{i+1} \pi^{(i+1)}(q)\right) \ge m\left(\tilde{H}, \pi^{(i)}(i+1), \mathcal{K} \setminus \cup_{q=1}^{i+1} \pi^{(i)}(q)\right).} & (50)\end{matrix}$

If $\hat{\pi}(i+1)$ equals $k$ then (49) and (50) prove the theorem; else, using (44), we see that

$\begin{matrix}{m\left(\tilde{H}, \pi^{(i+1)}(j+1), \mathcal{K} \setminus \cup_{q=1}^{j+1} \pi^{(i+1)}(q)\right) \ge m\left(\tilde{H}, \pi^{(i)}(j), \mathcal{K} \setminus \cup_{q=1}^{j} \pi^{(i)}(q)\right), \quad i+1 \le j \le |\pi^{(i)}|.} & (51)\end{matrix}$

From (51), (50) and (49) we have the desired result.
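The greedy process and the optimality claim of Theorem 2 can be exercised on a toy example. The sketch below is not the described receiver: it replaces the matrix-valued rate $I(\tilde{H}, j, S)$ of (40) with a hypothetical scalar stand-in, $\log_2(1 + g_j / (\sigma^2 + \sum_{i \in S} g_i))$, which shares the monotonicity property (44), and it uses the metric (41). The gains, rates, noise level and all function names are assumptions made purely for illustration.

```python
import math
from itertools import permutations

def rate(gains, noise, j, interferers):
    """Hypothetical scalar stand-in for I(H, j, S) in (40)."""
    return math.log2(1.0 + gains[j] / (noise + sum(gains[i] for i in interferers)))

def metric(gains, noise, rates, j, interferers):
    """Metric (41): achievable rate minus transmission rate."""
    return rate(gains, noise, j, interferers) - rates[j]

def greedy_order(gains, noise, rates, k):
    """Steps 1-4 of the greedy process; returns the decoding order ending in k."""
    remaining = set(gains)                                   # Step 1
    order = []
    while True:
        best = max(sorted(remaining),                        # Step 2
                   key=lambda j: metric(gains, noise, rates, j, remaining - {j}))
        remaining.remove(best)                               # Step 3
        order.append(best)
        if best == k:                                        # Step 4
            return order

def f(gains, noise, rates, order):
    """Objective (45): the smallest metric met along the decoding order."""
    undecoded = set(gains)
    worst = math.inf
    for j in order:
        undecoded.remove(j)          # BS j sees only the not-yet-decoded BSs
        worst = min(worst, metric(gains, noise, rates, j, undecoded))
    return worst

def brute_force_best(gains, noise, rates, k):
    """Maximum of (45) over every ordered subset of BSs ending in k."""
    others = [j for j in gains if j != k]
    best = -math.inf
    for r in range(len(others) + 1):
        for prefix in permutations(others, r):
            best = max(best, f(gains, noise, rates, list(prefix) + [k]))
    return best

gains = {1: 1.0, 2: 8.0, 3: 3.0, 4: 0.5}   # illustrative received powers
rates = {1: 0.5, 2: 1.0, 3: 1.0, 4: 0.2}   # illustrative rates R_j
noise, k = 0.1, 1                          # BS 1 is the desired BS
order = greedy_order(gains, noise, rates, k)
```

In this toy setting the greedy order decodes the two strongest interferers before the desired BS and leaves the weakest one undecoded, and its objective value matches the exhaustive maximum, as Theorem 2 predicts.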

The following remarks are now in order.

- The metrics in (41)-(43) are computed assuming Gaussian input alphabet and Gaussian interference. We can exploit the available modulation information by computing these metrics for the exact alphabets (constellations) used by all BSs, but this makes the metric computation quite involved. We can also compute the metric $m(\tilde{H}, j, S)$ by assuming the BSs in the set of interferers $S$ to be Gaussian interferers but using the actual alphabet for the BS $j$, which results in a simpler metric computation. In this work, we use the first (and simplest) option by computing the metrics as in (41)-(43). Moreover, the resulting decoding orders are shown in the sequel to perform quite well with finite alphabets and practical outer codes.
- A simple way to achieve the performance of the optimal GM-DFD with a lower average complexity is to first examine the decoding order suggested by the greedy process and, only in the case that the desired BS is decoded erroneously, to sequentially examine the remaining $\sum_{i=0}^{K-1} i! \binom{K-1}{i} - 1$ decoding orders.
- Note that when $f(\tilde{H}, \hat{\pi})$, where $\hat{\pi}$ is the order determined by the greedy rule, is negative, less than 1 and equal to 0 when $m(\tilde{H}, j, S)$ is computed according to (41), (42) and (43), respectively, we can infer that with high probability at least one BS will be decoded in error. In particular, suppose we use the metric in (41). Then an error will occur (with high probability) for the desired BS $k$ even after perfect cancellation of the previous BSs if $m(\tilde{H}, k, \mathcal{K} \setminus \cup_{j=1}^{|\hat{\pi}|} \hat{\pi}(j)) < 0$. On the other hand, when $m(\tilde{H}, k, \mathcal{K} \setminus \cup_{j=1}^{|\hat{\pi}|} \hat{\pi}(j)) > 0$ but $\min_{1 \le j \le |\hat{\pi}| - 1} m(\tilde{H}, \hat{\pi}(j), \mathcal{K} \setminus \cup_{i=1}^{j} \hat{\pi}(i)) < 0$, we can infer that the decoding of the desired BS will be affected (with high probability) by error propagation from BSs decoded previously. Unfortunately, it is hard to capture the effect of error propagation precisely and we have observed that the assumption that error propagation always leads to a decoding error for the desired BS is quite pessimistic.

6. Special Cases

In this section a lower complexity GMD is obtained at the cost of potential performance degradation by considering only two consecutive symbol intervals when designing the group MMSE filter. Further, when no interfering BS employs the quasi-orthogonal design, no loss of optimality is incurred. Similarly, when none of the BSs employ the quasi-orthogonal design, without loss of optimality we can design the GM-DFD by considering only two consecutive symbol intervals.

In this case, the 2×N channel output received over two consecutive symbol intervals can be written as (1). As before, the transmitted matrix $X$ can be partitioned as $X = [X_1, \ldots, X_K]$ but where

$\begin{matrix}{X_k = \begin{bmatrix} x_{k,1} & x_{k,2} \\ -x_{k,2}^{\dagger} & x_{k,1}^{\dagger} \end{bmatrix},} & (52)\end{matrix}$

when the $k^{th}$ BS employs the Alamouti design and

$\begin{matrix}{X_k = [x_{k,1} \; x_{k,2}]^T,} & (53)\end{matrix}$

when the $k^{th}$ BS has only one transmit antenna. Note that over two consecutive symbol intervals, an interfering BS employing the quasi-orthogonal design is equivalent to two dual transmit antenna BSs, each employing the Alamouti design. Then we can obtain a linear model of the form in (6), where $\tilde{x} = [\tilde{x}_1^T, \ldots, \tilde{x}_K^T]^T$ and $\tilde{x}_k = [x_{k,1}^R, x_{k,2}^R, x_{k,1}^I, x_{k,2}^I]^T$ with $\tilde{H} = [\tilde{H}_1, \ldots, \tilde{H}_K] = [\tilde{h}_1, \ldots, \tilde{h}_{4K}]$. The matrix $\tilde{H}_k$ corresponding to a BS employing the Alamouti design can be expanded as

$\begin{matrix}{\tilde{H}_k = [\tilde{h}_{4k-3}, \ldots, \tilde{h}_{4k}] = [\tilde{h}_{4k-3}, (I_N \otimes D_1)\tilde{h}_{4k-3}, (I_N \otimes D_2)\tilde{h}_{4k-3}, (I_N \otimes D_3)\tilde{h}_{4k-3}],} & (54)\end{matrix}$

with $\tilde{h}_{4k-3} = \mathrm{vec}([(H_k^R)^T, (H_k^I)^T]^T)$, whereas that corresponding to a single transmit antenna BS can be expanded as

$\begin{matrix}{\tilde{H}_k = [\tilde{h}_{4k-3}, \ldots, \tilde{h}_{4k}] = [\tilde{h}_{4k-3}, -(I_N \otimes D_1)\tilde{h}_{4k-3}, (I_N \otimes D_2)\tilde{h}_{4k-3}, (I_N \otimes D_3)\tilde{h}_{4k-3}],} & (55)\end{matrix}$

with $\tilde{h}_{4k-3} = \mathrm{vec}([(H_k^R)^T, 0_{N \times 1}, (H_k^I)^T, 0_{N \times 1}]^T)$. The matrices $D_1, D_2, D_3$ are given by

$\begin{matrix}{D_1 \overset{\Delta}{=} \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{bmatrix}, \quad D_2 \overset{\Delta}{=} \begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{bmatrix}, \quad D_3 \overset{\Delta}{=} \begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}.} & (56)\end{matrix}$

Note that the matrices defined in (56) have the following properties:

$\begin{matrix}{\begin{aligned} &D_l^T = -D_l, \quad D_l^T D_l = I, \quad 1 \le l \le 3, \\ &D_2^T D_1 = -D_3, \quad D_2^T D_3 = D_1, \quad D_1^T D_3 = -D_2. \end{aligned}} & (57)\end{matrix}$

Using the properties given in (57), we can prove the following theorem in a manner similar to that of Theorem 1. The proof is skipped for brevity.

Theorem 3. Consider the decoding of the $k^{th}$ BS. We have that

$\begin{matrix}{\tilde{H}_k^T \left( \sigma^2 I + \tilde{H}_k \tilde{H}_k^T \right)^{-1} \tilde{H}_k = \alpha_k I_4.} & (58)\end{matrix}$

Let $\tilde{U} \overset{\Delta}{=} \sigma^2 I + \tilde{H}\tilde{H}^T$ denote a sample covariance matrix obtained by considering two consecutive symbol intervals. Define $e_k = [2k-1, 2k, 4N+2k-1, 4N+2k]$, $1 \le k \le 2N$, and $e = [e_1, \ldots, e_{2N}]$, and let $M$ denote the permutation matrix obtained by permuting the rows of $I_{8N}$ according to $e$. Then, it can be verified that the matrices in (7) and (10), corresponding to Alamouti and single antenna BSs (over four symbol intervals), are equal (up to a column permutation) to $M(I_2 \otimes \tilde{H}_k)$, where $\tilde{H}_k$ is given by (54) and (55), respectively. Consequently, the covariance matrix $\tilde{R}$ in (21) is equal to $M(I_2 \otimes \tilde{U})M^T$ when no quasi-orthogonal BSs are present, so that $\tilde{R}^{-1} = M(I_2 \otimes \tilde{U}^{-1})M^T$. Moreover, it can be shown that the decoupling property also holds when the desired BS employs the quasi-orthogonal design and the filters are designed by considering two consecutive symbol intervals. Note that designing the MMSE filter by considering two consecutive symbol intervals implicitly assumes that no quasi-orthogonal interferers are present, so the demodulation is done accordingly.

Next, we consider the efficient computation of the inverse $\tilde{V} = \tilde{U}^{-1}$. Letting $D_0 = I_4$, analogous to (18) and (19), it can be shown that we can expand both $\tilde{U}$ and $\tilde{V}$ as

$\tilde{U} = \begin{bmatrix}\sum\limits_{i=0}^{3} D_i P_{11} D_i^T & \cdots & \sum\limits_{i=0}^{3} D_i P_{1N} D_i^T \\ \vdots & \ddots & \vdots \\ \sum\limits_{i=0}^{3} D_i P_{N1} D_i^T & \cdots & \sum\limits_{i=0}^{3} D_i P_{NN} D_i^T\end{bmatrix}, \quad \tilde{V} = \begin{bmatrix}\sum\limits_{i=0}^{3} D_i Q_{11} D_i^T & \cdots & \sum\limits_{i=0}^{3} D_i Q_{1N} D_i^T \\ \vdots & \ddots & \vdots \\ \sum\limits_{i=0}^{3} D_i Q_{N1} D_i^T & \cdots & \sum\limits_{i=0}^{3} D_i Q_{NN} D_i^T\end{bmatrix},$

where $\{P_{ij}, Q_{ij}\}_{i,j=1}^{N}$ are now 4×4 matrices satisfying (22). The inverse computation can be done recursively using the formula in (23). The following observations greatly reduce the number of computations involved.

First, utilizing the properties of the matrices $\{D_i\}$ in (57), we can show that the set

$\begin{matrix}{\begin{aligned} &\underline{Q} \overset{\Delta}{=} \left\{ \sum\limits_{i=0}^{3} D_i A D_i^T : A \in \mathbb{R}^{4 \times 4} \right\} = \left\{ \sum\limits_{i=0}^{3} b_i T_i : [b_0, \ldots, b_3] \in \mathbb{R}^{4} \right\}, \quad \text{where } T_0 = I_4, \text{ and} \\ &T_1 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \quad T_2 = \begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \quad T_3 = \begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}. \end{aligned}} & (59)\end{matrix}$

Thus $\underline{Q}$ is closed under the transpose operation and any matrix $B \in \underline{Q}$ is parametrized by four scalars. The matrices $\{T_i\}$ have the following properties:

$\begin{matrix}{\begin{aligned} &T_l^T = -T_l, \quad T_l^T T_l = I, \quad 1 \le l \le 3, \\ &T_2^T T_1 = T_3, \quad T_2^T T_3 = -T_1, \quad T_1^T T_3 = T_2. \end{aligned}} & (60)\end{matrix}$

Using these properties it can be verified that the set $\{\pm T_i\}_{i=0}^{3}$ is closed under matrix multiplication and the transpose operation. The following two lemmas provide useful properties of the set $\underline{Q}$. The proofs are similar to those of the previous two lemmas and hence are skipped for brevity.

Lemma 3.

$\begin{matrix}{\begin{aligned} &A, B \in \underline{Q} \;\Rightarrow\; AB \in \underline{Q}, \\ &A = A^T \in \underline{Q} \;\Leftrightarrow\; A = a_1 I_4, \end{aligned}} & (61)\end{matrix}$
for some scalar $a_1$, and

$\begin{matrix}{\begin{aligned} &\sum\limits_{i=0}^{3} D_i B D_i^T = b_1 I_4, \quad \forall\, B = B^T \in \mathbb{R}^{4 \times 4}, \\ &Q \in \underline{Q} \;\Rightarrow\; QQ^T = q_1 I_4, \end{aligned}} & (62)\end{matrix}$
for some scalars $b_1, q_1$.

Thus for any $A, B \in \underline{Q}$, the entire 4×4 matrix $AB$ can be determined by computing only one of its rows (or columns). Further, the set of all non-singular matrices in $\underline{Q}$ forms a matrix group under matrix multiplication and is given by

$\tilde{\underline{Q}} = \left\{ \sum\limits_{i=0}^{3} b_i T_i : [b_0, \ldots, b_3]^T \in \mathbb{R}^{4} \setminus \{0\} \right\}.$
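The properties of the 4×4 set can be checked numerically with the matrices $T_0, \ldots, T_3$ of (59). The sketch below uses integer coefficients so that all comparisons are exact; the helper names and the coefficient values are hypothetical. It illustrates that a product of two members is again a member that can be rebuilt from its first row alone, and that $AA^T$ is a scaled identity, so that the inverse of a non-zero member is its transpose divided by the sum of its squared coefficients.

```python
T = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],      # T0 = I4
    [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]],    # T1
    [[0, 0, -1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]],    # T2
    [[0, 0, 0, -1], [0, 0, 1, 0], [0, -1, 0, 0], [1, 0, 0, 0]],    # T3
]

def member(b):
    """Return sum_i b[i] * T[i] as a 4x4 list of rows."""
    return [[sum(b[i] * T[i][r][c] for i in range(4)) for c in range(4)]
            for r in range(4)]

def matmul(x, y):
    """Plain 4x4 matrix product."""
    return [[sum(x[r][k] * y[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def transpose(x):
    return [list(col) for col in zip(*x)]

def coeffs_from_first_row(m):
    """The first row of sum_i b_i T_i equals [b0, b1, -b2, -b3]."""
    r = m[0]
    return [r[0], r[1], -r[2], -r[3]]

a = [2, -1, 3, 5]          # illustrative coefficients
b = [1, 4, 0, -2]
A, B = member(a), member(b)

AB = matmul(A, B)
rebuilt = member(coeffs_from_first_row(AB))   # whole product from one row

norm = sum(x * x for x in a)                  # q1 in (62)
AAT = matmul(A, transpose(A))
scaled_identity = [[norm if r == c else 0 for c in range(4)] for r in range(4)]
```

Computing one row of the product costs 16 multiplications instead of 64, which is the saving referred to above.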

The present invention has been shown and described in what are considered to be the most practical and preferred embodiments. It is anticipated, however, that departures may be made therefrom and that obvious modifications will be implemented by those skilled in the art. It will be appreciated that those skilled in the art will be able to devise numerous arrangements and variations which, not explicitly shown or described herein, embody the principles of the invention and are within their spirit and scope.

1. A method for receiving data using a receiver equipped with multiple receive antennas in a wireless channel, on which all dominant transmitting sources transmit using inner codes of rate one symbol per channel use, comprising steps of: estimating a channel matrix seen from a desired transmitter source among the dominant transmitting sources in response to a pilot or preamble signal; converting the estimated channel matrix into an effective channel matrix responsive to the inner code used by the desired transmitting source; collecting the signals received by the multiple receive antennas over four consecutive intervals; separating the real and imaginary parts of the collected received signals and arranging them into a vector of real valued elements; estimating a covariance matrix of the noise plus interference in a linear model whose output is an equivalent of the received observations and in which the effective channel matrix corresponding to each dominant transmitting source inherits the structure of its inner code; computing and feeding back a signal-to-noise-plus-interference-ratio SINR responsive to the covariance matrix and the effective channel matrix corresponding to the desired source and a decoupling property comprising the relationship $\tilde{H}_1^T \tilde{R}_1^{-1} \tilde{H}_1 = \alpha_1 C_1 + \beta_1 C_3$, where $\tilde{H}_1$ is the effective channel matrix corresponding to the desired source (with index 1); $\tilde{R}_1$ is an estimate of said covariance matrix and $\tilde{R}_1^{-1}$ is its inverse; $\tilde{H}_1^T$ is the matrix transpose of $\tilde{H}_1$; $\alpha_1$, $\beta_1$ are scalars that depend on $\tilde{H}_1$ and $\tilde{R}_1$; $C_1$ is the 8 times 8 identity matrix and $C_3$ is a particular fixed matrix equal to $I_2 \otimes \begin{bmatrix}0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0\end{bmatrix}$, where $I_2$ is the 2 times 2 identity matrix and $\otimes$ denotes the Kronecker product.
2. The method of claim 1, wherein the step of estimating the covariance matrix comprises using said vectors as sample input vectors for a covariance matrix estimator.
3. The method of claim 1, wherein the step of estimating the covariance matrix comprises a processing step such that the processed covariance matrix estimate has the property that all of its principal 8 times 8 sub-matrices belong to a matrix group that is closed under the matrix transpose operation and which is parameterized by 8 real valued scalars.
4. The method of claim 1, wherein the step of estimating the covariance matrix comprises estimating channel matrices seen from one or more dominant transmitter sources not including the said desired transmitting source in response to one or more pilots or preamble signals.
5. The method of claim 1, wherein said transmitter source can have distributed (non co-located) physical transmit antennas.
6. The method of claim 1, wherein said transmitter source is formed by two or more transmitter sources which pool their physical transmit antennas and cooperatively transmit a signal to the destination receiver.
7. A method for receiving data using a receiver equipped with multiple receive antennas in a wireless channel, on which all dominant transmitting sources transmit using inner codes of rate one symbol per channel use, comprising steps of: estimating a channel matrix seen from a desired transmitter source among the dominant transmitting sources in response to a pilot or preamble signal; converting the estimated channel matrix into an effective channel matrix responsive to the inner code used by the desired transmitting source; collecting the signals received by the multiple receive antennas over two consecutive intervals; separating the real and imaginary parts of the collected received signals and arranging them into a vector of real valued elements; estimating the covariance matrix of the noise plus interference from the dominant transmitting sources not including the said desired transmitting source; computing and feeding back a signal-to-noise-plus-interference-ratio SINR responsive to the covariance matrix and the effective channel matrix corresponding to the desired source and a decoupling property comprising the relationship $\tilde{H}_1^T \tilde{R}_1^{-1} \tilde{H}_1 = \alpha_1 I_4$, where $\tilde{H}_1$ is the effective channel matrix corresponding to the desired source (with index 1); $\tilde{R}_1$ is an estimate of the said covariance matrix and $\tilde{R}_1^{-1}$ is its inverse; $\tilde{H}_1^T$ is the matrix transpose of $\tilde{H}_1$; $\alpha_1$ is a scalar that depends on $\tilde{H}_1$ and $\tilde{R}_1$; $I_4$ is the 4 times 4 identity matrix.
8. The method of claim 7, wherein the step of estimating the covariance matrix comprises using the said vectors as sample input vectors for a covariance matrix estimator.
9. The method of claim 7, wherein the step of estimating the covariance matrix comprises estimating channel matrices seen from one or more dominant transmitter sources not including the said desired transmitting source in response to one or more pilots or preamble signals.
10. The method of claim 7, wherein the step of estimating the covariance matrix also comprises a processing step such that the processed covariance matrix estimate has the property that all of its principal 4 times 4 sub-matrices belong to a matrix group that is closed under the matrix transpose operation and which is parameterized by 4 real valued scalars.
11. A method for receiving data using a receiver equipped with multiple receive antennas in a wireless channel, on which all dominant transmitting sources transmit using inner codes of rate one symbol per channel use, comprising steps of: estimating a channel matrix seen from a desired transmitter source among the dominant transmitting sources in response to a pilot or preamble signal; converting the estimated channel matrix into an effective channel matrix responsive to the inner code used by the desired transmitting source; collecting the signals received by the multiple receive antennas over four consecutive intervals; separating the real and imaginary parts of the collected received signals and arranging them into a vector of real valued elements; estimating the covariance matrix of the noise plus interference from the dominant transmitting sources not including the said desired transmitting source; computing a linear filter and demodulating data responsive to the covariance matrix and the effective channel matrix corresponding to the desired source and a decoupling property comprising the relationship $\tilde{H}_1^T \tilde{R}_1^{-1} \tilde{H}_1 = \alpha_1 C_1 + \beta_1 C_3$, where $\tilde{H}_1$ is the effective channel matrix corresponding to the desired source (with index 1); $\tilde{R}_1$ is an estimate of the said covariance matrix and $\tilde{R}_1^{-1}$ is its inverse; $\tilde{H}_1^T$ is the matrix transpose of $\tilde{H}_1$; $\alpha_1$, $\beta_1$ are scalars that depend on $\tilde{H}_1$ and $\tilde{R}_1$; $C_1$ is the 8 times 8 identity matrix and $C_3$ is a particular fixed matrix equal to $I_2 \otimes \begin{bmatrix}0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0\end{bmatrix}$, where $I_2$ is the 2 times 2 identity matrix and $\otimes$ denotes the Kronecker product.
 12. The method of claim 11,wherein the step of estimating the covariance matrix also comprises aprocessing step such that the processed covariance matrix estimate hasthe property that all of its principal 8 times 8 sub-matrices belong toa matrix group that is closed under the matrix transpose operation andwhich is parameterized by 8 real valued scalars.
 13. The method of claim 11, wherein the step of demodulation comprising the joint detection of four complex quadrature amplitude modulation QAM symbols (or eight real pulse amplitude modulation PAM symbols) is split into four smaller joint detection problems involving a pair of PAM symbols each.
 14. A method for receiving data using a receiver equipped with multiple receive antennas in a wireless channel, on which all dominant transmitting sources transmit using inner codes of rate one symbol per channel use, comprising steps of: estimating a channel matrix seen from a desired transmitter source among the dominant transmitting sources in response to a pilot or preamble signal; converting the estimated channel matrix into an effective channel matrix responsive to the inner code used by the desired transmitting source; collecting the signals received by the multiple receive antennas over two consecutive intervals; separating the real and imaginary parts of the collected received signals and arranging them into a vector of real valued elements; estimating the covariance matrix of the noise plus interference from the dominant transmitting sources not including the said desired transmitting source; computing a linear filter and demodulating data responsive to the covariance matrix and the effective channel matrix corresponding to the desired source and a decoupling property comprising the relationship $\tilde{H}_1^{T}\tilde{R}_1^{-1}\tilde{H}_1 = \alpha_1 I_4$, where $\tilde{H}_1$ is the effective channel matrix corresponding to the desired source (with index 1); $\tilde{R}_1$ is an estimate of the said covariance matrix and $\tilde{R}_1^{-1}$ is its inverse; $\tilde{H}_1^{T}$ is the matrix transpose of $\tilde{H}_1$; $\alpha_1$ is a scalar that depends on $\tilde{H}_1$ and $\tilde{R}_1$; $I_4$ is the 4 times 4 identity matrix.
 15. The method of claim 14, wherein the step of estimating the covariance matrix also comprises a processing step such that the processed covariance matrix estimate has the property that all of its principal 8 times 8 sub-matrices belong to a matrix group that is closed under the matrix transpose operation and which is parameterized by 8 real valued scalars.
 16. The method of claim 14, wherein the step of demodulation comprising the joint detection of two complex quadrature amplitude modulation QAM symbols (or four real pulse amplitude modulation PAM symbols) is split into four smaller joint detection problems involving one PAM symbol each.
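
The decoupling property recited in claim 14 can be checked numerically. The following is a minimal sketch, not the claimed implementation, and it rests on several assumptions that the claims do not state: the rate-one inner code over two intervals is taken to be the Alamouti code with transmit matrix [[x1, x2], [-x2*, x1*]]; the real-valued observation vector stacks, per receive antenna, [Re y(1), Im y(1), Re y(2), Im y(2)]; and the noise-plus-interference covariance is modeled as sigma2·I plus the sum of H_k H_k^T over the interfering sources. The helper name `alamouti_effective_channel` and all parameter values are hypothetical.

```python
import numpy as np

def alamouti_effective_channel(h):
    """Real-valued effective channel (4N x 4) for an assumed Alamouti inner code.

    h: complex array of shape (N, 2); h[n, t] is the gain from transmit
    antenna t to receive antenna n. The real symbol vector is assumed to be
    [Re x1, Im x1, Re x2, Im x2].
    """
    blocks = []
    for h1, h2 in h:
        a, b, c, d = h1.real, h1.imag, h2.real, h2.imag
        blocks.append(np.array([
            [a, -b,  c, -d],   # Re y(1)
            [b,  a,  d,  c],   # Im y(1)
            [c,  d, -a, -b],   # Re y(2)
            [d, -c, -b,  a],   # Im y(2)
        ]))
    return np.vstack(blocks)

rng = np.random.default_rng(0)
N = 3          # receive antennas (illustrative value)
K = 3          # dominant sources: index 0 is desired, the rest interfere
sigma2 = 0.5   # assumed white-noise variance per real dimension

# Random complex channels, one effective matrix per dominant source.
H = [
    alamouti_effective_channel(
        rng.standard_normal((N, 2)) + 1j * rng.standard_normal((N, 2))
    )
    for _ in range(K)
]

# Assumed model of the covariance of noise plus interference, excluding
# the desired source.
R1 = sigma2 * np.eye(4 * N) + sum(Hk @ Hk.T for Hk in H[1:])

# Gram matrix H1^T R1^{-1} H1; the decoupling property says it is alpha1 * I4.
G = H[0].T @ np.linalg.solve(R1, H[0])
alpha1 = np.trace(G) / 4
decoupled = np.allclose(G, alpha1 * np.eye(4))

# Linear filter W = H1^T R1^{-1} / alpha1 (R1 is symmetric, so the transpose
# of R1^{-1} H1 equals H1^T R1^{-1}). W @ H1 is the identity, so each filter
# output depends on exactly one PAM symbol plus filtered noise.
W = np.linalg.solve(R1, H[0]).T / alpha1

print("decoupled:", decoupled)
```

Because `W @ H[0]` is the 4 times 4 identity under these assumptions, each of the four filter outputs can be sliced to the nearest PAM point independently, which is the split into four single-symbol problems described in claim 16.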